NATURAL STUPIDITY: CITIES & WINGS
DATA SOURCES
1. Google Earth Engine
Dataset: GHSL: Degree of Urbanisation 1975-2030 V2-0 (P2023A)
It classifies the whole world into urban and rural categories on a grid of 1 km² cells, using the Degree of Urbanisation (DEGURBA) Stage I methodology recommended by the UN Statistical Commission.

The table shows the classifications and their codes, referred to as "smod".
2. eBird API
eBird is an online database of bird observations run by the Cornell Lab of Ornithology at Cornell University and the National Audubon Society. It provides scientists, researchers and amateur naturalists with real-time data on bird distribution and abundance. It was launched in 2002 and already contains 100 million observations worldwide.
DATA COLLECTION
Step 1: Location and Time
This project focuses on the UK, where we live, because both geographic and bird data are adequate there.
The study period runs from January 1, 2020, to December 31, 2024. Five years is long enough to capture seasonal changes in observation abundance and richness without making data collection and processing too difficult.
Step 2: Geographic Data
We divide the whole UK into 4,533 blocks of 10 km x 10 km. Each block is assigned a single smod value calculated from the original 1 km² values, in a way that prioritises identifying urban areas. We use the 2025 data, on the assumption that urbanisation progresses steadily and changes little over 5 years.
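The exact calculation is not spelled out here, so the following is only a minimal sketch of one way to collapse 1 km² smod cells into 10 km blocks. It assumes that "prioritising urban areas" means taking the highest smod code found in a block (in the GHSL scheme, higher codes are more urban, with 30 for urban centres down to 10 for water). The function name `aggregate_smod` and the list-of-lists input are hypothetical.

```python
def aggregate_smod(cells, factor=10):
    """Collapse a grid of 1 km smod codes into blocks of factor x factor cells.

    ASSUMPTION: each block takes the maximum (most urban) code it contains.
    The project's actual rule may differ.
    """
    n_rows, n_cols = len(cells), len(cells[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")

    blocks = []
    for r0 in range(0, n_rows, factor):
        block_row = []
        for c0 in range(0, n_cols, factor):
            # Highest smod code among the cells of this block.
            block_row.append(
                max(
                    cells[r][c]
                    for r in range(r0, r0 + factor)
                    for c in range(c0, c0 + factor)
                )
            )
        blocks.append(block_row)
    return blocks


# A 20 km x 20 km toy area: very low density rural (11) everywhere,
# with one urban-centre cell (30) and one suburban cell (21).
grid = [[11] * 20 for _ in range(20)]
grid[3][4] = 30
grid[15][12] = 21
block_values = aggregate_smod(grid)
```

With a maximum rule, a single urban cell is enough to mark the whole block as urban, which favours detecting urban areas at the cost of overstating them in mostly rural blocks.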
Step 3: Bird Observation Data
We use `requests` to fetch the historical bird observations in this 5-year period. 376,944 observations were collected.
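As a rough illustration of the fetching step, the sketch below walks through every day of the study period and requests that day's observations from the eBird API 2.0 historic-observations endpoint. It assumes the data were requested by region with the code `GB`; the project may instead have queried smaller sub-regions. The helper names and the `api_key` parameter are hypothetical, and no rate limiting or retry logic is shown.

```python
from datetime import date, timedelta

API_ROOT = "https://api.ebird.org/v2/data/obs"


def daily_dates(start, end):
    """Yield every date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def historic_url(region_code, day):
    """Build the historic-observations URL for one region and one day."""
    return f"{API_ROOT}/{region_code}/historic/{day.year}/{day.month}/{day.day}"


def fetch_day(region_code, day, api_key):
    """Fetch one day's observations as a list of dicts (needs a valid key)."""
    # Imported here so the URL helpers above work without the dependency.
    import requests

    response = requests.get(
        historic_url(region_code, day),
        headers={"X-eBirdApiToken": api_key},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


study_days = list(daily_dates(date(2020, 1, 1), date(2024, 12, 31)))
# Usage (ASSUMPTION: region code "GB"; requires a personal eBird API key):
# observations = [obs for day in study_days for obs in fetch_day("GB", day, api_key)]
```

Requesting one day at a time keeps each response small and makes it easy to resume after a failed request.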
What does raw data look like?
The fetched data looks like this (easier-to-read names are used instead of the original variable names):

DATA PROCESSING & STORAGE
- Validity Check: We first checked the “Is the observation valid?” column and confirmed that all observations were valid (i.e. reviewed automatically by eBird).
- Date Extraction: We created a new column named “observation date”, extracted from the “observation time” column, because the specific time of day is of no interest in this project.
- Creating Links: By comparing the latitude and longitude of each observation with the block information, we assigned each observation to the block in which it occurred. The 18.8% of observations that did not fall within any block were deleted; the reasons are explained in NB02.
- We also integrated a dataset from Wikipedia containing summaries of all bird species in the observation data, as a supplement.
- Data Storage: We store the data in an SQL database with the following schema:

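The date extraction and block-linking steps described in this section can be sketched as follows. This is a simplified illustration, not the project's code: it assumes each block is stored as a latitude/longitude bounding box, that timestamps look like `YYYY-MM-DD HH:MM`, and that the record keys (`lat`, `lng`, `observation_time`, `block_id` and the bounding-box fields) have these hypothetical names.

```python
def observation_date(observation_time):
    """Keep only the date part of a 'YYYY-MM-DD HH:MM' timestamp string."""
    return observation_time.split(" ")[0]


def find_block(lat, lon, blocks):
    """Return the id of the block containing the point, or None if there is none.

    ASSUMPTION: blocks are axis-aligned boxes in latitude/longitude.
    """
    for block in blocks:
        if (block["min_lat"] <= lat < block["max_lat"]
                and block["min_lon"] <= lon < block["max_lon"]):
            return block["block_id"]
    return None


def link_observations(observations, blocks):
    """Add a date and a block id to each observation; drop unmatched ones."""
    linked = []
    for obs in observations:
        block_id = find_block(obs["lat"], obs["lng"], blocks)
        if block_id is None:
            continue  # observation falls outside every block
        linked.append({
            **obs,
            "observation_date": observation_date(obs["observation_time"]),
            "block_id": block_id,
        })
    return linked


blocks = [
    {"block_id": 1, "min_lat": 51.0, "max_lat": 51.1,
     "min_lon": -0.2, "max_lon": -0.05},
]
observations = [
    {"lat": 51.05, "lng": -0.1, "observation_time": "2020-01-01 10:30"},
    {"lat": 60.0, "lng": 5.0, "observation_time": "2021-06-02"},
]
linked = link_observations(observations, blocks)
```

A linear scan over all blocks for every observation is slow at the scale of several thousand blocks and hundreds of thousands of observations; a spatial index or a direct grid-coordinate calculation would be a natural replacement.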